|
|
|
|
|
|
|
Quality issues |
|
Nonexecution-based testing |
|
Execution-based testing |
|
What should be tested? |
|
Testing versus correctness proofs |
|
Who should perform execution-based testing? |
|
When testing stops |
|
|
|
|
|
|
|
Two types of testing |
|
Execution-based testing |
|
Nonexecution-based testing |
|
|
|
|
|
|
“V & V” |
|
Verification |
|
Determine if the phase was completed correctly |
|
Validation |
|
Determine if the product as a whole satisfies
its requirements |
|
|
|
|
|
Warning |
|
“Verify” is also used for all nonexecution-based
testing |
|
|
|
|
|
|
Not “excellence” |
|
Extent to which software satisfies its
specifications |
|
Software Quality Assurance (SQA) |
|
Goes far beyond V & V |
|
Managerial independence between the development group and the SQA group |
|
|
|
|
|
|
|
Underlying principles |
|
We should not review our own work |
|
Group synergy |
|
|
|
|
|
4–6 members, chaired by SQA |
|
Preparation—lists of items |
|
Inspection |
|
Up to 2 hours |
|
Detect, don’t correct |
|
Document-driven, not participant-driven |
|
Verbalization leads to fault finding |
|
Performance appraisal |
|
|
|
|
|
Five-stage process |
|
Overview |
|
Preparation, aided by statistics of fault types |
|
Inspection |
|
Rework |
|
Follow-up |
|
|
|
|
Recorded by severity and fault type |
|
Compare with previous products |
|
What if there are a disproportionate number of
faults in a module? |
|
Carry forward fault statistics to the next phase |
|
|
|
|
|
82% of all detected faults (IBM, 1976) |
|
70% of all detected faults (IBM, 1978) |
|
93% of all detected faults (IBM, 1986) |
|
90% decrease in the cost of detecting a fault
(switching system, 1986) |
|
4 major faults, 14 minor faults per 2 hours
(JPL, 1990). Savings of $25,000 per inspection |
|
Number of faults decreased exponentially by
phase (JPL, 1992) |
|
Warning |
|
Fault statistics and performance appraisal |
|
|
|
|
Fault density (e.g., faults per KLOC) |
|
Fault detection rate (e.g., faults detected per
hour) |
|
By severity (major/minor), by phase |
|
What does a 50% increase in the fault detection
rate mean? |
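The two metrics above can be made concrete with a small calculation. The following is a minimal sketch, not a standard tool: the `InspectionRecord` type, its field names, and the 2 KLOC size are all hypothetical, and only the "4 major, 14 minor faults per 2 hours" figure comes from the statistics quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class InspectionRecord:
    """Hypothetical record of one inspection; every field name is illustrative."""
    kloc: float        # size of the inspected code, in thousands of lines
    hours: float       # time spent inspecting
    major_faults: int
    minor_faults: int

    @property
    def total_faults(self) -> int:
        return self.major_faults + self.minor_faults


def fault_density(rec: InspectionRecord) -> float:
    """Faults per KLOC."""
    return rec.total_faults / rec.kloc


def fault_detection_rate(rec: InspectionRecord) -> float:
    """Faults detected per hour of inspection."""
    return rec.total_faults / rec.hours


# Assumed data: the first record uses the JPL figure of 4 major and 14 minor
# faults in 2 hours; the 2 KLOC size and the second record are invented.
previous = InspectionRecord(kloc=2.0, hours=2.0, major_faults=4, minor_faults=14)
current = InspectionRecord(kloc=2.0, hours=2.0, major_faults=6, minor_faults=21)

increase = (fault_detection_rate(current) - fault_detection_rate(previous)) \
    / fault_detection_rate(previous)
print(f"density: {fault_density(previous)} -> {fault_density(current)} faults/KLOC")
print(f"detection rate increase: {increase:.0%}")
```

The number alone does not answer the question posed above: a higher detection rate may mean the inspection process has improved, or it may mean the product contains more faults to find. Comparing against previous products and later phases is what helps tell the two apart.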
|
|
|
|
|
Definitions |
|
Failure (incorrect behavior) |
|
Fault (NOT “bug”) |
|
Error (mistake made by programmer) |
|
Nonsensical statement |
|
“Testing is demonstration that faults are not
present” |
|
|
|
|
|
“The process of inferring certain behavioral
properties of product based, in part, on results of executing product in
known environment with selected inputs.” |
|
Inference |
|
Known environment |
|
Selected inputs |
|
But what should be tested? |
|
|
|
|
|
Does it meet user’s needs? |
|
Ease of use |
|
Useful functions |
|
Cost-effectiveness |
|
|
|
|
|
Frequency and criticality of failure |
|
Mean time between failures |
|
Mean time to repair |
|
Mean time and cost to repair the results of a failure |
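As a rough illustration of the first two reliability measures, they can be computed from a failure log. This is a sketch under simple assumptions: the function names and the sample figures are hypothetical, and mean time between failures is taken here as total operating time divided by the number of failures.

```python
def mean_time_between_failures(operating_hours: float, failure_count: int) -> float:
    """Average operating time per failure (assumes at least one failure)."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operating_hours / failure_count


def mean_time_to_repair(repair_hours: list) -> float:
    """Average time needed to repair a failure."""
    if not repair_hours:
        raise ValueError("MTTR is undefined without at least one repair")
    return sum(repair_hours) / len(repair_hours)


# Invented sample: 1000 hours of operation, four failures with these repair times.
repairs = [2.0, 4.0, 6.0, 8.0]
mtbf = mean_time_between_failures(1000.0, len(repairs))
mttr = mean_time_to_repair(repairs)
print(f"MTBF = {mtbf} h, MTTR = {mttr} h")
```

Neither figure captures criticality: one failure that corrupts data may matter more than many that merely require a restart, which is why the time and cost to repair the results of a failure are listed separately.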
|
|
|
|
|
Range of operating conditions |
|
Possibility of unacceptable results with valid
input |
|
Effect of invalid input |
|
|
|
|
|
|
Extent to which space and time constraints are
met |
|
Real-time software |
|
|
|
|
Incorrect specification for a sort |
|
|
|
|
|
|
|
|
|
|
|
Function trickSort which satisfies this
specification: |
|
|
|
|
|
Incorrect specification for a sort: |
|
|
|
|
|
|
|
|
|
|
|
|
|
Corrected specification for the sort: |
|
|
|
|
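One way to see how a sort specification can be wrong is sketched below. It assumes that the incorrect specification only requires the output to be in nondecreasing order, and that the correction adds the requirement that the output is a permutation of the input. The specification text itself is not reproduced here, so both predicates and the `trick_sort` body are illustrative assumptions.

```python
from collections import Counter


def satisfies_incorrect_spec(p: list, q: list) -> bool:
    """Assumed incorrect spec: q has the same length as p and is nondecreasing."""
    return len(q) == len(p) and all(q[i] <= q[i + 1] for i in range(len(q) - 1))


def satisfies_corrected_spec(p: list, q: list) -> bool:
    """Assumed corrected spec: additionally, q is a permutation of p."""
    return satisfies_incorrect_spec(p, q) and Counter(p) == Counter(q)


def trick_sort(p: list) -> list:
    """Meets the assumed incorrect spec without sorting anything."""
    return [0] * len(p)


p = [3, 1, 2]
q = trick_sort(p)
print(satisfies_incorrect_spec(p, q))   # the trick output passes the weak spec
print(satisfies_corrected_spec(p, q))   # but fails once permutation is required
```

A product that satisfies an incorrect specification is "correct" in the formal sense and still useless, which is the point of the slides that follow.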
NOT necessary |
|
NOT sufficient |
|
|
|
|
Alternative to execution-based testing |
|
|
|
|
Code segment to be proven correct |
|
|
|
|
Flowchart of code segment |
|
|
|
|
|
|
Never prove a program correct without testing it
as well |
|
|
|
|
|
1969 — Naur Paper |
|
“Naur text-processing problem” |
|
Given a text consisting of words separated
by blank or by nl (new line) characters, convert it to line-by-line form in
accordance with following rules: |
|
(1) line breaks must be made only where
given text has blank or nl ; |
|
(2) each line is filled as far as
possible, as long as |
|
(3) no line will contain more than maxpos
characters |
|
Naur constructed a procedure (25 lines of Algol
60), and informally proved its correctness |
|
|
|
|
|
1970 — Reviewer in Computing Reviews |
|
The first word of the first line is preceded by a blank
unless the first word is exactly maxpos characters long |
|
|
|
|
|
1971 — London finds 3 more faults |
|
Including: |
|
The procedure does not terminate unless a word
longer than maxpos characters is encountered |
|
|
|
|
|
1975 — Goodenough and Gerhart find three further
faults |
|
Including: |
|
The last word will not be output unless it is
followed by a blank or nl |
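The faults listed above translate directly into test cases. The sketch below is not Naur's Algol 60 procedure. It is a hypothetical Python implementation of the three rules, and it makes two assumptions the problem statement leaves open: runs of separators are treated as one, and a word longer than maxpos raises an error.

```python
def fill_lines(text: str, maxpos: int) -> list:
    """Break text into lines of at most maxpos characters, filling each line."""
    # str.split() with no argument splits on any whitespace (blank, nl, tab)
    # and collapses runs of separators; both behaviors are assumptions here.
    words = text.split()
    lines = []
    current = ""
    for word in words:
        if len(word) > maxpos:
            # Assumed policy: the rules cannot be met, so reject the input.
            raise ValueError(f"word longer than maxpos: {word!r}")
        if not current:
            current = word                      # first word on a line: no leading blank
        elif len(current) + 1 + len(word) <= maxpos:
            current += " " + word               # rule (2): fill the line as far as possible
        else:
            lines.append(current)               # rule (3): never exceed maxpos
            current = word
    if current:
        lines.append(current)                   # output the last line even without a trailing separator
    return lines


result = fill_lines("aaa bbb\nccc", 7)
print(result)
```

Each assertion a tester would write against this function corresponds to a fault found in the published procedure: no leading blank on the first line, termination on ordinary input, and output of a final word that has no trailing blank or nl.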
|
|
|
|
Lesson: |
|
Even if a product is proved correct, it must STILL
be tested. |
|
|
|
|
Software engineers do not have enough math for
proofs |
|
Proving is too expensive to be practical |
|
Proving is too hard |
|
|
|
|
Can we trust a theorem prover? |
|
How do we find input–output specifications and loop
invariants? |
|
What if the specifications are wrong? |
|
We can never be sure that the specifications or the
verification system are correct |
|
|
|
|
|
Correctness proofs are a vital software
engineering tool, WHERE APPROPRIATE. |
|
If |
|
Human lives are at stake |
|
Indicated by cost/benefit analysis |
|
Risk of not proving is too great |
|
Also, informal proofs can improve the quality
of the product |
|
Assert statement |
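An assert statement lets an informal proof travel with the code: the input specification, loop invariant, and output specification are written as executable checks. The following is a minimal sketch on a hypothetical summation loop; the function name and the choice of invariant are illustrative assumptions.

```python
def array_sum(y: list) -> int:
    """Sum the elements of y, checking the informal proof obligations at run time."""
    n = len(y)
    k = 0
    s = 0
    while k < n:
        # Loop invariant: s holds the sum of the first k elements.
        assert 0 <= k <= n and s == sum(y[:k]), "loop invariant violated"
        s += y[k]
        k += 1
    # Output specification: the loop has ended and s is the sum of all n elements.
    assert k == n and s == sum(y), "output specification violated"
    return s


print(array_sum([1, 2, 3]))
```

In Python, assertions are skipped when the interpreter runs with the `-O` option, so they serve as a development aid rather than as a substitute for input validation or for testing.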
|
|
|
|
|
Testing is destructive |
|
A successful test finds a fault |
|
Solution |
|
1. The programmer does informal testing |
|
2. SQA does systematic testing |
|
3. The programmer debugs the module |
|
All test cases must be |
|
Planned beforehand, including expected output |
|
Retained afterwards |
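Planning test cases beforehand, with their expected output, and retaining them afterwards can be as simple as keeping them in a table that is checked into the project. A minimal sketch follows; the case names, the table layout, and the `run_planned_cases` helper are all hypothetical.

```python
# Each planned case records its name, its input, and the expected output,
# all written down before the code under test is run.
PLANNED_CASES = [
    ("empty input", [], []),
    ("already sorted", [1, 2, 3], [1, 2, 3]),
    ("reverse order", [3, 2, 1], [1, 2, 3]),
    ("duplicates", [2, 1, 2], [1, 2, 2]),
]


def run_planned_cases(function_under_test, cases):
    """Run every planned case and return the ones whose output was not as expected."""
    failures = []
    for name, given, expected in cases:
        actual = function_under_test(list(given))   # copy so cases stay reusable
        if actual != expected:
            failures.append((name, given, expected, actual))
    return failures


# A deliberately faulty sort: the planned cases should expose it.
faulty = run_planned_cases(lambda p: [0] * len(p), PLANNED_CASES)
print([name for name, *_ in faulty])
```

Because the table is retained, the same cases can be rerun after every change, which is what makes later regression testing possible.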
|
|
|
|
|
|
Only when the product has been irrevocably
retired |
|